The loan data comes from Prosper. The data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information. The data set was last updated on 3/11/2014.
## [1] 81
## [1] "ListingKey"
## [2] "ListingNumber"
## [3] "ListingCreationDate"
## [4] "CreditGrade"
## [5] "Term"
## [6] "LoanStatus"
## [7] "ClosedDate"
## [8] "BorrowerAPR"
## [9] "BorrowerRate"
## [10] "LenderYield"
## [11] "EstimatedEffectiveYield"
## [12] "EstimatedLoss"
## [13] "EstimatedReturn"
## [14] "ProsperRating..numeric."
## [15] "ProsperRating..Alpha."
## [16] "ProsperScore"
## [17] "ListingCategory..numeric."
## [18] "BorrowerState"
## [19] "Occupation"
## [20] "EmploymentStatus"
## [21] "EmploymentStatusDuration"
## [22] "IsBorrowerHomeowner"
## [23] "CurrentlyInGroup"
## [24] "GroupKey"
## [25] "DateCreditPulled"
## [26] "CreditScoreRangeLower"
## [27] "CreditScoreRangeUpper"
## [28] "FirstRecordedCreditLine"
## [29] "CurrentCreditLines"
## [30] "OpenCreditLines"
## [31] "TotalCreditLinespast7years"
## [32] "OpenRevolvingAccounts"
## [33] "OpenRevolvingMonthlyPayment"
## [34] "InquiriesLast6Months"
## [35] "TotalInquiries"
## [36] "CurrentDelinquencies"
## [37] "AmountDelinquent"
## [38] "DelinquenciesLast7Years"
## [39] "PublicRecordsLast10Years"
## [40] "PublicRecordsLast12Months"
## [41] "RevolvingCreditBalance"
## [42] "BankcardUtilization"
## [43] "AvailableBankcardCredit"
## [44] "TotalTrades"
## [45] "TradesNeverDelinquent..percentage."
## [46] "TradesOpenedLast6Months"
## [47] "DebtToIncomeRatio"
## [48] "IncomeRange"
## [49] "IncomeVerifiable"
## [50] "StatedMonthlyIncome"
## [51] "LoanKey"
## [52] "TotalProsperLoans"
## [53] "TotalProsperPaymentsBilled"
## [54] "OnTimeProsperPayments"
## [55] "ProsperPaymentsLessThanOneMonthLate"
## [56] "ProsperPaymentsOneMonthPlusLate"
## [57] "ProsperPrincipalBorrowed"
## [58] "ProsperPrincipalOutstanding"
## [59] "ScorexChangeAtTimeOfListing"
## [60] "LoanCurrentDaysDelinquent"
## [61] "LoanFirstDefaultedCycleNumber"
## [62] "LoanMonthsSinceOrigination"
## [63] "LoanNumber"
## [64] "LoanOriginalAmount"
## [65] "LoanOriginationDate"
## [66] "LoanOriginationQuarter"
## [67] "MemberKey"
## [68] "MonthlyLoanPayment"
## [69] "LP_CustomerPayments"
## [70] "LP_CustomerPrincipalPayments"
## [71] "LP_InterestandFees"
## [72] "LP_ServiceFees"
## [73] "LP_CollectionFees"
## [74] "LP_GrossPrincipalLoss"
## [75] "LP_NetPrincipalLoss"
## [76] "LP_NonPrincipalRecoverypayments"
## [77] "PercentFunded"
## [78] "Recommendations"
## [79] "InvestmentFromFriendsCount"
## [80] "InvestmentFromFriendsAmount"
## [81] "Investors"
## ListingKey ListingNumber ListingCreationDate CreditGrade
## 139 11273541569159931E84F17 569000 22:33.4
## 180 0F1E35343868130956BD68F 544844 50:26.0
## Term LoanStatus ClosedDate BorrowerAPR BorrowerRate LenderYield
## 139 36 Defaulted 20/09/2012 0:00 0.33973 0.2999 0.2899
## 180 36 Defaulted 20/08/2012 0:00 0.34731 0.3073 0.2973
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## 139 0.2766 0.149 0.1276
## 180 0.2837 0.149 0.1347
## ProsperRating..numeric. ProsperRating..Alpha. ProsperScore
## 139 2 E 3
## 180 2 E 1
## ListingCategory..numeric. BorrowerState Occupation
## 139 6 KY Military Enlisted
## 180 2 MN Postal Service
## EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
## 139 Employed 126 TRUE
## 180 Employed 87 TRUE
## CurrentlyInGroup GroupKey DateCreditPulled CreditScoreRangeLower
## 139 FALSE 06/03/2012 11:00 620
## 180 FALSE 16/12/2011 3:50 660
## CreditScoreRangeUpper FirstRecordedCreditLine CurrentCreditLines
## 139 639 20/04/2001 0:00 7
## 180 679 11/03/1999 0:00 16
## OpenCreditLines TotalCreditLinespast7years OpenRevolvingAccounts
## 139 8 30 2
## 180 15 33 14
## OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
## 139 25 5 5
## 180 343 11 34
## CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
## 139 2 1890 23
## 180 1 0 0
## PublicRecordsLast10Years PublicRecordsLast12Months
## 139 0 0
## 180 0 0
## RevolvingCreditBalance BankcardUtilization AvailableBankcardCredit
## 139 72 0.07 928
## 180 4752 0.23 8306
## TotalTrades TradesNeverDelinquent..percentage. TradesOpenedLast6Months
## 139 27 0.75 1
## 180 32 0.96 5
## DebtToIncomeRatio IncomeRange IncomeVerifiable StatedMonthlyIncome
## 139 0.35 $25,000-49,999 TRUE 3750.000
## 180 0.13 $50,000-74,999 TRUE 4583.333
## LoanKey TotalProsperLoans TotalProsperPaymentsBilled
## 139 A6773646313973238A33299 1 3
## 180 44BC36372930801559159FD 1 2
## OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
## 139 3 0
## 180 1 1
## ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
## 139 0 2000
## 180 0 4500
## ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
## 139 0 -36
## 180 0 -17
## LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
## 139 121 6
## 180 170 8
## LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
## 139 24 62391 3000
## 180 27 57647 5500
## LoanOriginationDate LoanOriginationQuarter MemberKey
## 139 21/03/2012 0:00 Q1 2012 87C83528199783859742DC3
## 180 20/12/2011 0:00 Q4 2011 B64B35063311601836A9F9B
## MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
## 139 127.34 127.34 23.82
## 180 235.69 707.07 292.65
## LP_InterestandFees LP_ServiceFees LP_CollectionFees
## 139 103.52 -5.90 0
## 180 414.42 -13.48 0
## LP_GrossPrincipalLoss LP_NetPrincipalLoss
## 139 2976.18 0.00
## 180 5207.35 5207.35
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## 139 764.27 1 0
## 180 0.00 1 0
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## 139 0 0 31
## 180 0 0 45
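Column names such as ProsperRating..numeric. and TradesNeverDelinquent..percentage. in the output above are what R produces when a CSV header contains spaces or parentheses. The sketch below illustrates this with a made-up three-column stand-in for the real file. It assumes the original headers read `ProsperRating (numeric)` and `TradesNeverDelinquent (percentage)`, which is an assumption because the raw file is not shown here.

```r
# Toy stand-in for the real file: three columns, two rows (values are made up).
# The header spellings are an assumption about the raw CSV.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "ListingNumber,ProsperRating (numeric),TradesNeverDelinquent (percentage)",
  "1,2,0.75",
  "2,5,0.96"
), csv_path)

# read.csv() defaults to check.names = TRUE, which passes the header through
# make.names() and replaces spaces and parentheses with dots.
loans <- read.csv(csv_path, stringsAsFactors = FALSE)

ncol(loans)   # number of variables
names(loans)  # syntactic names, e.g. "ProsperRating..numeric."
```

Passing `check.names = FALSE` to `read.csv()` would keep the headers as written, at the cost of needing backticks when referring to those columns.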
BorrowerState: The two-letter abbreviation of the state of the borrower's address at the time the listing was created.
LoanStatus: The current status of the loan: Cancelled, Chargedoff, Completed, Current, Defaulted, FinalPaymentInProgress, Past Due (1-15 days), Past Due (16-30 days), Past Due (31-60 days), Past Due (61-90 days), and Past Due (91-120 days).
IncomeRange: The income range of the borrower at the time the listing was created.
Term: The length of the loan expressed in months.
ListingCategory: The category of the listing that the borrower selected when posting their listing: 0 - Not Available, 1 - Debt Consolidation, 2 - Home Improvement, 3 - Business, 4 - Personal Loan, 5 - Student Use, 6 - Auto, 7 - Other, 8 - Baby&Adoption, 9 - Boat, 10 - Cosmetic Procedure, 11 - Engagement Ring, 12 - Green Loans, 13 - Household Expenses, 14 - Large Purchases, 15 - Medical/Dental, 16 - Motorcycle, 17 - RV, 18 - Taxes, 19 - Vacation, 20 - Wedding Loans.
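Because the category is stored as a number, plots and tables are easier to read once the codes are mapped to their labels. A minimal sketch of one way to do that with `factor()`, using the labels from the definition above and a made-up vector of codes (the variable names are illustrative, not the author's):

```r
# Labels in code order 0..20, copied from the definition above.
listing_labels <- c(
  "Not Available", "Debt Consolidation", "Home Improvement", "Business",
  "Personal Loan", "Student Use", "Auto", "Other", "Baby&Adoption", "Boat",
  "Cosmetic Procedure", "Engagement Ring", "Green Loans", "Household Expenses",
  "Large Purchases", "Medical/Dental", "Motorcycle", "RV", "Taxes",
  "Vacation", "Wedding Loans"
)

# Codes 6 and 2 are the ListingCategory..numeric. values of the two sample
# rows printed earlier; the other two are made up.
codes <- c(6, 2, 1, 0)

# levels = 0:20 pins each code to its label regardless of which codes occur.
listing_category <- factor(codes, levels = 0:20, labels = listing_labels)

as.character(listing_category)
# "Auto" "Home Improvement" "Debt Consolidation" "Not Available"
```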
Recommendations: Number of recommendations the borrower had at the time the listing was created.
BorrowerAPR: The Borrower’s Annual Percentage Rate (APR) for the loan. An annual percentage rate (APR) is the annual rate charged for borrowing or earned through an investment. APR is expressed as a percentage that represents the actual yearly cost of funds over the term of a loan.
BorrowerRate: The borrower's interest rate for this loan.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0799 0.2289 0.2925 0.2823 0.3473 0.4135
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0699 0.2005 0.2610 0.2507 0.3099 0.3600
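The two blocks of output above have the shape of `summary()` results on numeric columns. They are not labeled; since an APR is normally at least as high as the interest rate, the first block is presumably BorrowerAPR and the second BorrowerRate, but that is an inference. A small sketch using only the two sample rows printed earlier:

```r
# BorrowerAPR and BorrowerRate of the two sample rows printed earlier.
loans <- data.frame(
  BorrowerAPR  = c(0.33973, 0.34731),
  BorrowerRate = c(0.2999, 0.3073)
)

# summary() on a numeric vector returns Min., 1st Qu., Median, Mean,
# 3rd Qu. and Max., the same six columns as the output above.
summary(loans$BorrowerAPR)
summary(loans$BorrowerRate)

# The APR folds fees into the rate, so it should not fall below the rate.
all(loans$BorrowerAPR >= loans$BorrowerRate)
```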
TotalProsperLoans: Number of Prosper loans the borrower had at the time they created this listing. This value will be null if the borrower had no prior loans.
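In R these nulls show up as NA. Since NA here means "no prior loans" rather than "unknown", one option is to recode it as 0 instead of dropping the rows. This is a sketch of that option on made-up values, not necessarily the treatment used in this analysis:

```r
# Made-up values: NA marks borrowers with no prior Prosper loans.
total_prosper_loans <- c(1, NA, 3, NA, 1)

# Option 1: keep NA and count how many borrowers it affects.
sum(is.na(total_prosper_loans))

# Option 2: recode NA as 0 prior loans.
prior_loans <- ifelse(is.na(total_prosper_loans), 0, total_prosper_loans)
prior_loans
```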
StatedMonthlyIncome: The monthly income the borrower stated at the time the listing was created.
IsBorrowerHomeowner: A borrower is classified as a homeowner if they have a mortgage on their credit profile or provide documentation confirming they are a homeowner.
53.38% of borrowers are homeowners and 46.62% are not.
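Shares like these can be computed from a logical column with `table()` and `prop.table()`. A minimal sketch on made-up flags (the real column would be IsBorrowerHomeowner):

```r
# Made-up flags standing in for the IsBorrowerHomeowner column.
is_homeowner <- c(TRUE, TRUE, FALSE, TRUE)

# table() counts each value; prop.table() turns the counts into shares.
share <- round(100 * prop.table(table(is_homeowner)), 2)
share
```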
MonthlyLoanPayment: The scheduled monthly loan payment.
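The scheduled payment can be related to the loan amount, rate, and term. The sketch below assumes Prosper uses the standard fixed-rate amortization formula, which the data set description does not state. For the two sample rows printed earlier, the formula gives the same values as their MonthlyLoanPayment column. The function name is illustrative.

```r
# Standard fixed-rate amortization payment (assumed, not confirmed by Prosper):
#   payment = P * r / (1 - (1 + r)^(-n)), with r the monthly rate.
monthly_payment <- function(principal, annual_rate, term_months) {
  r <- annual_rate / 12
  principal * r / (1 - (1 + r)^(-term_months))
}

# LoanOriginalAmount, BorrowerRate and Term of sample rows 139 and 180.
payment <- monthly_payment(
  principal   = c(3000, 5500),
  annual_rate = c(0.2999, 0.3073),
  term_months = c(36, 36)
)

round(payment, 2)
# 127.34 235.69, the MonthlyLoanPayment values shown for those rows
```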
LoanOriginationQuarter: The quarter in which the loan was originated.
IncomeRange: The income range of the borrower at the time the listing was created. LoanStatus: The current status of the loan: Cancelled, Chargedoff, Completed, Current, Defaulted, FinalPaymentInProgress, PastDue.
Borrowers with a medium income (between 25,000 USD and 74,999 USD) have the most loans, and in my view these loans are large relative to their monthly income. The relationship seems to be that borrowers with a high income can take out loans and complete them on time, whereas borrowers in a low income range may not be able to complete the loan on time.
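A claim about income range and loan outcome can be checked with a two-way table of counts and the share of each status within an income range. A minimal sketch on made-up rows (the real data has more levels and far more observations):

```r
# Made-up rows standing in for the IncomeRange and LoanStatus columns.
loans <- data.frame(
  IncomeRange = c("$25,000-49,999", "$25,000-49,999", "$50,000-74,999",
                  "$50,000-74,999", "$50,000-74,999"),
  LoanStatus  = c("Completed", "Defaulted", "Completed", "Completed", "Current")
)

# Counts of loans per income range and status.
counts <- table(loans$IncomeRange, loans$LoanStatus)

# margin = 1 normalizes within each income range, so each row sums to 1.
status_share <- prop.table(counts, margin = 1)
round(status_share, 2)
```

Comparing rows of `status_share` rather than raw counts keeps the larger income groups from dominating the comparison.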
EmploymentStatus: The employment status of the borrower at the time they posted the listing.
AvailableBankcardCredit: The total available credit via bank card at the time the credit profile was pulled.
ProsperRating..numeric.: The Prosper Rating assigned at the time the listing was created: 0 - N/A, 1 - HR, 2 - E, 3 - D, 4 - C, 5 - B, 6 - A, 7 - AA. Applicable for loans originated after July 2009.
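The numeric rating maps one-to-one onto the letter rating, and the order matters (AA is the best grade). One way to keep both the labels and the order is an ordered factor. A sketch using the mapping from the definition above, with illustrative variable names:

```r
# Labels in code order 0..7, copied from the definition above.
rating_labels <- c("N/A", "HR", "E", "D", "C", "B", "A", "AA")

# 2 is the ProsperRating..numeric. value of both sample rows (alpha "E");
# 7 is added for contrast.
rating_numeric <- c(2, 2, 7)

prosper_rating <- factor(rating_numeric, levels = 0:7,
                         labels = rating_labels, ordered = TRUE)

as.character(prosper_rating)            # "E" "E" "AA"
prosper_rating[3] > prosper_rating[1]   # TRUE: AA ranks above E
```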
I was very interested in analyzing this dataset. prosperLoanData is a dataset from Prosper, which was founded in 2005 as the first peer-to-peer lending marketplace in the United States. Since then, Prosper has facilitated more than $14 billion in loans to more than 870,000 people. The dataset contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information.
First, I looked at the dataset using the str and summary functions to get the structure and the five-number summary of the variables. Then I read the variable definitions and, for some variables, searched for more information in order to explore them. The dataset contained missing values that needed cleaning. In the first section I installed the needed packages, loaded the libraries, and removed the missing data (NAs). I ran into some bugs while coding, for example when converting between string and numeric formats, converting from numeric to factor, and extracting the dates and adding them as three separate variables.
In the univariate section I investigated 13 of the 81 variables, and to learn more about them I plotted each one using ggplot and geom layers. To remove repetitive code, I created functions that made the coding easier. The second section is about the relationship between pairs of variables, for example the relationship between stated monthly income and the borrower's occupation. The last section is about the relationships among more than two variables, to show how these variables are related.
Before this project I did not know anything about bank loans or how they work, which made the project a little difficult for me. I spent many hours researching the variables and watching videos about loans. It was a challenge, but I was interested in exploring and analyzing this dataset.
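Two of the conversion steps mentioned in the reflection, splitting a date into three variables and converting a factor back to numbers, can be sketched in base R as follows. The date strings are the LoanOriginationDate values of the two sample rows printed earlier; the day/month/year format, the variable names, and the term values are assumptions for illustration, not the author's code.

```r
# LoanOriginationDate values of the two sample rows (day/month/year).
origination <- c("21/03/2012 0:00", "20/12/2011 0:00")

# as.Date() reads only what the format describes and ignores the trailing time.
origination_date <- as.Date(origination, format = "%d/%m/%Y")

# Three separate variables. The names are illustrative.
origination_year  <- as.integer(format(origination_date, "%Y"))
origination_month <- as.integer(format(origination_date, "%m"))
origination_day   <- as.integer(format(origination_date, "%d"))

# A common conversion bug: as.numeric() on a factor returns level codes,
# not the original numbers. Convert through character first.
term <- factor(c("36", "60", "12"))   # made-up term values
as.numeric(term)                 # level codes
as.numeric(as.character(term))   # the actual terms
```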